1.  Cover  Sheet:  Attached.

2.  Objectives:  The  objective  of  this  project  was  to  develop  steganalysis  techniques  for 
images  that  have  been  potentially  subjected  to  a  watermarking  algorithm.  Our  effort  was 
directed  towards  two  separate  research  areas:  Detection  of  robust  watermarking 
techniques  using  quality  measures  and  detection  of  fragile  authentication  watermarks. 

In  the  first  part  of  the  proposed  research  our  objective  was  to  develop  universal 
steganalysis  techniques  for  identifying  the  presence  of  robust  digital  watermarks  by  using 
image  quality  metrics  to  detect  artifacts  induced  into  the  image  by  the  watermarking 
process.  Specific  objectives  included: 

1.  Identification  of  image  quality  features  that  best  aid  in  differentiating  between 
watermarked  and  non-watermarked  images 

2.  The  development  of  classification  techniques  that  separate  such  images  in  the 
selected  feature  space. 

In  the  second  part  of  the  proposed  research  we  focused  on  detection  of  images 
watermarked  with  authentication  watermarks  for  image  integrity  protection.  Because 
authentication  watermarks  are  almost  always  very  weak  signals,  the  methodology  cannot 
be  based  on  quality  metrics. 

The  project  was  done  as  a  collaborative  effort  between  the  PI  Nasir  Memon  at 
Polytechnic University and the Co-PI Jiri Fridrich at SUNY Binghamton. Project
deliverables include a working software prototype for steganalysis, implemented on a PC
platform, which was delivered to AFRL for experimentation.


3.  Status  of  Effort 


Our  objectives  towards  development  of  steganalysis  techniques  based  on  image  quality 
metrics  were  more  than  met.  We  conducted  extensive  experimentation  with  a  large  image 
data  set  to  identify  image  quality  features  that  best  aid  in  differentiating  between 
watermarked  and  non-watermarked  images.  We  then  developed  classification  techniques 
that  separate  such  images  in  the  selected  feature  space.  Results  were  published  in  two 
conference  papers  and  one  journal  paper. 

Our work has made good progress towards developing universal steganalysis
techniques that can distinguish between cover objects and stego objects. It represents
the first published demonstration that such a technique is indeed possible. However, we need
further improvements in the false positive and miss rates to make the techniques more
robust and reliable.

Besides  steganalysis  based  on  image  quality  metrics,  we  also  developed  a  steganalysis 
technique  based  on  binary  similarity  metrics.  This  gave  results  comparable  to  those 
obtained  by  image  quality  metrics  but  at  a  significantly  reduced  cost  in  terms  of 
computational  complexity. 

Our work in steganalysis led us to the question of how much data can be embedded in
an image without being reliably detected. We formulated the problem in a theoretical
setting and have given some preliminary answers. These answers provide some insight into
the fundamental bounds of steganalysis. We are currently studying the problem further to
better understand these bounds.


4.  Summary  of  Achievements 

We  made  progress  on  several  fronts  in  the  project.  Below  we  itemize  these  achievements 
by  topic  and  summarize  the  main  results  obtained.  More  detailed  results  are  in  the  papers 
listed  in  the  publications  section. 

1.  Image  Quality  Metric  (IQM)  Based  Steganalysis:  Our  earlier  preliminary  work  had 
suggested  that  a  particular  watermarking  scheme  leaves  statistical  evidence  or 
structure  that  can  be  exploited  for  detection  with  the  aid  of  proper  selection  of  image 
features  and  multivariate  regression  analysis.  In  this  work,  we  identified  some 
sophisticated  image  quality  metrics  as  the  feature  set  to  distinguish  between 
watermarked and unwatermarked images. To identify the specific quality measures
that provide the best discriminative power, we used analysis of variance (ANOVA)
techniques.  We  conducted  extensive  experiments  with  different  feature  sets  and 
classification  techniques  using  well-known  and  commercially  available  watermarking 
techniques.  The  results  obtained  validate  our  approach  and  we  were  able  to 
distinguish  between  watermarked  and  unwatermarked  images  with  moderate 
accuracy. However, a significant amount of further experimental work and
mathematical analysis is needed before we get a better understanding of the nature
of the artifacts caused by watermarking and the best means to exploit this knowledge for
the purpose of steganalysis. Initial results of our technique were presented in [19].
Additional results showing the ability to detect unknown steganographic algorithms
were presented in [17]. Finally, these results were consolidated in a journal paper
[1].
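
The ANOVA-based feature selection described above can be illustrated with a minimal sketch. It assumes each image has already been reduced to a vector of quality-metric values; the function names `anova_f` and `rank_features` and the synthetic data are hypothetical and are not taken from the project software. The sketch computes a one-way ANOVA F statistic per feature and ranks features by it, so a feature whose mean shifts between the clean and watermarked groups ranks first.

```python
import random
from statistics import mean


def anova_f(groups):
    """One-way ANOVA F statistic for one feature measured in several groups."""
    values = [v for g in groups for v in g]
    grand_mean = mean(values)
    group_means = [mean(g) for g in groups]
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    # Variation of the samples around their own group mean.
    ss_within = sum((v - m) ** 2
                    for g, m in zip(groups, group_means) for v in g)
    df_between = len(groups) - 1
    df_within = len(values) - len(groups)
    # Note: raises ZeroDivisionError if every group has zero variance.
    return (ss_between / df_between) / (ss_within / df_within)


def rank_features(clean_rows, marked_rows):
    """Return feature indices sorted from most to least discriminative."""
    n_features = len(clean_rows[0])
    scores = []
    for j in range(n_features):
        clean = [row[j] for row in clean_rows]
        marked = [row[j] for row in marked_rows]
        scores.append((anova_f([clean, marked]), j))
    return [j for _, j in sorted(scores, reverse=True)]


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical data: feature 0 shifts under watermarking, feature 1 does not.
    clean = [[rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)] for _ in range(200)]
    marked = [[rng.gauss(1.0, 1.0), rng.gauss(0.0, 1.0)] for _ in range(200)]
    print(rank_features(clean, marked))
```

In practice the rows would hold measured image quality metrics rather than Gaussian samples, and the top-ranked features would then be passed to the classifier.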

2.  Binary Similarity Measure Based Steganalysis: One of the limitations of IQM based
steganalysis was the computation time needed. Hence we started looking at
computationally simpler approaches. Our efforts led us to the development of a novel
technique that employs the seventh and eighth bits (the two least significant bits) of
each pixel in an image to compute a set of binary similarity measures. The basic idea is
that there should be more correlation between these bits in a clean image than in a
stego image, as the 8th bit in a stego image is relatively random. The steganalyzer, that
is, the marked/non-marked classifier, is
built using multivariate regression on the set of computed binary similarity measures.
Simulation  results  with  a  set  of  images  and  well-known  LSB  type  steganographic 
techniques  indicate  that  the  new  steganalyzer  provides  promising  results.  One  way  to 
interpret  this  steganalyzer  is  that  it  uses  binary  similarity  measures  as  image  quality 
metrics for the purpose of steganalysis. Since these are simpler to compute, this leads to
a more efficient IQM based steganalysis technique. A conference paper describing
our approach and the results was written [10]. A journal paper is currently in
preparation.
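
The intuition behind using the two lowest bit planes can be sketched as follows. This is a toy illustration only: the statistic `neighbor_agreement` is a hypothetical stand-in and is not one of the binary similarity measures from [10], and the synthetic cover is far smoother than a natural image. The sketch extracts the 7th and 8th bit planes, measures how often horizontally adjacent bits agree, and shows that full LSB replacement pushes the 8th-plane agreement towards 0.5 while leaving the 7th plane untouched.

```python
import random


def bit_plane(image, bit):
    """Extract one bit plane; bit=0 is the least significant (8th) bit."""
    return [[(pixel >> bit) & 1 for pixel in row] for row in image]


def neighbor_agreement(plane):
    """Fraction of horizontally adjacent positions holding the same bit."""
    same = total = 0
    for row in plane:
        for left, right in zip(row, row[1:]):
            same += left == right
            total += 1
    return same / total


def lsb_replace(image, rng):
    """Overwrite every LSB with a random message bit (full-capacity embedding)."""
    return [[(pixel & ~1) | rng.getrandbits(1) for pixel in row] for row in image]


def features(image):
    """Toy feature vector: agreement in the 7th and in the 8th bit plane."""
    return (neighbor_agreement(bit_plane(image, 1)),
            neighbor_agreement(bit_plane(image, 0)))


if __name__ == "__main__":
    # Hypothetical, unrealistically smooth 64x64 cover image.
    cover = [[4 * ((x + y) // 8) for x in range(64)] for y in range(64)]
    stego = lsb_replace(cover, random.Random(0))
    print("cover:", features(cover))
    print("stego:", features(stego))
```

A feature vector of this kind, computed for many clean and stego images, is what a regression-based classifier would be trained on.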

3.  LSB  Steganalysis  for  Images:  We  also  developed  another  LSB  steganalysis 
technique  that  can  detect  the  existence  of  hidden  messages  that  are  randomly 
embedded  in  the  least  significant  bits  of  natural  continuous-tone  images.  The 
technique is inspired by the RS steganalysis technique of Fridrich et al. and, just like
RS steganalysis, it can also precisely measure the length of the embedded message, even
when  the  hidden  message  is  very  short  relative  to  the  image  size.  The  key  to  our 
success  is  the  formation  of  some  subsets  of  pixels  whose  cardinalities  change  with 
LSB  embedding,  and  such  changes  can  be  precisely  quantified  under  the  assumption 
that  the  embedded  bits  are  randomly  scattered.  Interestingly,  our  study  on 
steganalysis  of  LSB  embedding  sheds  light  on  the  recent  work  of  Fridrich  et  al.  on  the 
detection  of  LSB  embedding,  and  offers  an  analytical  proof  of  an  observation  made 
by  them.  A  preliminary  paper  describing  our  approach  was  presented  at  ICIP  [11]. 
Additional investigation using this approach is currently being carried out. This part
of the research was not originally planned in the proposal but grew out of
our activity in steganography and discussions with other researchers, namely Xiaolin
Wu and his post-doctoral student Sorina Dumitrescu.
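
Why randomly scattered embedding makes such changes quantifiable can be seen from a small simulation. This is not the estimator from [11]: it compares the stego signal with the known cover, which a real steganalyzer does not have, and the names `embed_random_lsb` and `changed_fraction` are hypothetical. It only illustrates the underlying fact that, when message bits are independent of the cover, each visited pixel changes with probability 1/2, so an embedding rate p alters about p/2 of the pixels.

```python
import random


def embed_random_lsb(pixels, rate, rng):
    """Replace the LSB of a randomly scattered fraction `rate` of the pixels."""
    stego = list(pixels)
    n_message_bits = round(rate * len(pixels))
    for i in rng.sample(range(len(pixels)), n_message_bits):
        stego[i] = (stego[i] & ~1) | rng.getrandbits(1)
    return stego


def changed_fraction(cover, stego):
    """Fraction of pixels whose value differs between cover and stego."""
    return sum(c != s for c, s in zip(cover, stego)) / len(cover)


if __name__ == "__main__":
    rng = random.Random(1)
    cover = [rng.randrange(256) for _ in range(20000)]  # hypothetical pixel values
    for rate in (0.1, 0.3, 0.5):
        stego = embed_random_lsb(cover, rate, rng)
        # Twice the changed fraction should land close to the embedding rate.
        print(rate, round(2 * changed_fraction(cover, stego), 3))
```

A blind technique has to recover the same quantity from statistics of the stego image alone, which is where the pixel subsets described above come in.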

4.  Mathematical  Analysis  of  Steganographic  Capacity:  During  our  research  it 
occurred  to  us  that  although  there  have  been  many  techniques  for  hiding  messages  in 
images,  there  has  been  little  mathematical  analysis  establishing  their  statistical 
indistinguishability from cover images. Hence we started looking at some specific
image  based  steganography  techniques  and  derived  a  closed  form  expression  of  the 
probability  of  false  detection  in  terms  of  the  number  of  bits  that  are  hidden.  This  led 
us to the notion of steganographic capacity, that is, how many bits can we hide in a
cover image without causing statistically significant modifications? Our results
provide an upper bound on this capacity. This work was done in collaboration with
R. Chandramouli of Stevens Institute of Technology and published in ICIP 2001 [8]. We believe this
is  a  promising  area  of  work  and  will  improve  our  mathematical  understanding  of 
steganography and steganalysis. Again, this work was not originally planned in the
proposal but developed as a by-product of our activity in steganography.

5.  Audio Steganalysis: We had always maintained that our quality metric based
steganalysis techniques for images were equally applicable to audio and video, with
appropriate selection of corresponding quality measures. A PhD student of our NSF
collaborator on this project, B. Sankur, has started looking at statistical methods to
detect  the  presence  of  hidden  messages  in  audio  signals.  Initial  experimental  results 
show  that  the  proposed  technique  can  be  used  to  detect  the  presence  of  hidden 
messages  in  digital  audio  data.  These  results  were  published  in  [7]. 

6.  Covert  Channels  by  Data  Masking:  While  pursuing  our  investigation  into 
steganography  and  steganalysis,  it  occurred  to  us  that  the  basic  model  of 
steganography  can  be  turned  on  its  head  to  produce  interesting  alternative  techniques. 
Steganography strives to embed a secret message into a cover object in such a way that
the  cover  object  and  stego  object  are  statistically  and  perceptually  indistinguishable. 
However,  in  any  automated  system,  perceptual  tests  are  not  practical  and  only 
statistical  tests  will  be  carried  out.  This  means  that  given  a  secret  message,  we  could 
“massage” it into a stego object that looks like a cover object from a statistical point
of view but not from a perceptual point of view. Using audio stego objects we could
show that an order of magnitude more bits can be embedded. Results were published in [9].

7.  Other Work: In addition to steganalysis, we also worked on related problems. These
include:

o  Image Authentication: We analyzed a well-known robust hash function
for  image  data  called  the  Visual  Hash  Function  (VHF)  and  showed  that  it 
is  susceptible  to  attacks.  Given  just  an  input  and  its  hash  value,  we  showed 
how  to  construct  a  statistical  model  of  the  hash  function,  without  any 
knowledge  of  the  secret  key  used  to  compute  the  hash.  This  model  can 
then  be  used  to  engineer  arbitrary  and  malicious  collisions.  We  proposed 
a  possible  modification  to  VHF  so  that  constructing  a  model  that  mimics 
its behavior becomes difficult. Results were published in [6].

o  Image  Reassembly:  Reassembly  of  fragmented  objects  from  a  collection 
of  randomly  mixed  fragments  is  a  common  problem  in  classical  forensics. 
We  addressed  the  digital  forensic  equivalent,  i.e.,  reassembly  of  document 
fragments,  using  statistical  modeling  tools  applied  in  data  compression. 
We showed how images and documents can be recovered solely from their
fragments. Results were published in [12].


o  Watermarking Protocols: In [4] we presented a watermarking protocol that
enables Alice to demonstrate the presence of a watermark to Bob without
revealing  the  watermark.  This  is  a  difficult  problem  and  although  our 
proposed  protocol  was  not  a  zero-knowledge  protocol,  it  was  the  first  step 
in  the  future  development  of  such  a  protocol. 
AFRL.  Possibilities  were  discussed  during  our  meetings.]